Physically Plausible Full-Body Hand-Object Interaction Synthesis
We propose a physics-based method for synthesizing dexterous hand-object
interactions in a full-body setting. While recent advancements have addressed
specific facets of human-object interactions, a comprehensive physics-based
approach remains a challenge. Existing methods often focus on isolated segments
of the interaction process and rely on data-driven techniques that may result
in artifacts. In contrast, our proposed method embraces reinforcement learning
(RL) and physics simulation to mitigate the limitations of data-driven
approaches. Through a hierarchical framework, we first learn skill priors for
both body and hand movements in a decoupled setting. The generic skill priors
learn to decode a latent skill embedding into the motion of the underlying
part. A high-level policy then controls hand-object interactions in these
pretrained latent spaces, guided by task objectives of grasping and 3D target
trajectory following. It is trained using a novel reward function that combines
an adversarial style term with a task reward, encouraging natural motions while
fulfilling the task incentives. Our method successfully accomplishes the
complete interaction task, from approaching an object to grasping and
subsequent manipulation. We compare our approach against kinematics-based
baselines and show that it leads to more physically plausible motions.
Comment: Project page at https://eth-ait.github.io/phys-fullbody-gras
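The abstract describes training the high-level policy with a reward that combines an adversarial style term with a task reward. A minimal sketch of such a blended reward is below; the sigmoid squashing of the discriminator logit and the weights `w_style`/`w_task` are illustrative assumptions (a common AMP-style formulation), not the paper's exact definition.

```python
import numpy as np

def combined_reward(disc_logit, task_reward, w_style=0.5, w_task=0.5):
    """Blend an adversarial style term with a task reward.

    disc_logit: discriminator logit for the policy's transition; higher
    values mean the motion looks more like the reference data.
    task_reward: reward for grasping / trajectory-following progress.
    """
    # Squash the discriminator logit to a style reward in [0, 1]
    # (an assumed AMP-style form; the paper's exact form may differ).
    style_reward = 1.0 / (1.0 + np.exp(-disc_logit))
    return w_style * style_reward + w_task * task_reward
```

With this shape, a motion that both looks natural (high logit) and makes task progress scores highest, which is the trade-off the reward is meant to encode.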
MARLUI: Multi-Agent Reinforcement Learning for Adaptive UIs
Adaptive user interfaces (UIs) automatically change an interface to better
support users' tasks. Recently, machine learning techniques have enabled the
transition to more powerful and complex adaptive UIs. However, a core challenge
for adaptive user interfaces is the reliance on high-quality user data that has
to be collected offline for each task. We formulate UI adaptation as a
multi-agent reinforcement learning problem to overcome this challenge. In our
formulation, a user agent mimics a real user and learns to interact with a UI.
Simultaneously, an interface agent learns UI adaptations to maximize the user
agent's performance. The interface agent learns the task structure from the
user agent's behavior and, based on that, can support the user agent in
completing its task. Our method produces adaptation policies that are learned
in simulation only and, therefore, does not need real user data. Our
experiments show that learned policies generalize to real users and achieve on-par performance with data-driven supervised learning baselines.
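The core idea above is a two-agent loop: an interface agent adapts the UI, and a simulated user agent acts on the adapted UI, so no real user data is needed. The toy sketch below illustrates that loop with hand-rolled stand-ins (the class names, the move-to-front adaptation, and the binary reward are all hypothetical simplifications of the learned agents).

```python
class UserAgent:
    """Stand-in for the simulated user: selects an item in the adapted UI."""
    def act(self, layout):
        # A real user agent is a learned policy; this stand-in always
        # picks the first item, so a good adaptation puts the needed
        # item up front.
        return layout[0]

class InterfaceAgent:
    """Stand-in interface agent: reorders the UI to help the user agent."""
    def adapt(self, layout, predicted_target):
        # Move the predicted next target to the front of the menu.
        rest = [item for item in layout if item != predicted_target]
        return [predicted_target] + rest

def simulate_step(layout, true_target):
    """One interaction step: adapt the UI, let the user act, score it."""
    ui = InterfaceAgent().adapt(layout, predicted_target=true_target)
    chosen = UserAgent().act(ui)
    reward = 1.0 if chosen == true_target else 0.0
    return chosen, reward
```

In the actual method both agents are trained with multi-agent RL, with the interface agent's reward tied to the user agent's task performance; the sketch only shows the interaction structure.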
The Six Hug Commandments: Design and Evaluation of a Human-Sized Hugging Robot with Visual and Haptic Perception
Receiving a hug is one of the best ways to feel socially supported, and the
lack of social touch can have severe negative effects on an individual's
well-being. Based on previous research both within and outside of HRI, we
propose six tenets ("commandments") of natural and enjoyable robotic hugging: a
hugging robot should be soft, be warm, be human sized, visually perceive its
user, adjust its embrace to the user's size and position, and reliably release
when the user wants to end the hug. Prior work validated the first two tenets,
and the final four are new. We followed all six tenets to create a new robotic
platform, HuggieBot 2.0, that has a soft, warm, inflated body (HuggieChest) and
uses visual and haptic sensing to deliver closed-loop hugging. We first
verified the outward appeal of this platform in comparison to the previous
PR2-based HuggieBot 1.0 via an online video-watching study involving 117 users.
We then conducted an in-person experiment in which 32 users each exchanged
eight hugs with HuggieBot 2.0, experiencing all combinations of visual hug
initiation, haptic sizing, and haptic releasing. The results show that adding
haptic reactivity definitively improves user perception of a hugging robot,
largely verifying our four new tenets and illuminating several interesting
opportunities for further improvement.
Comment: 9 pages, 6 Figures, 2 Tables, ACM/IEEE Human-Robot Interaction (HRI) Conference 202
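The sixth tenet, reliably releasing when the user wants to end the hug, implies closed-loop monitoring of the haptic signal. A hypothetical heuristic for such a release decision is sketched below; the threshold, window size, and force-drop criterion are assumptions for illustration, not HuggieBot 2.0's actual detector.

```python
def should_release(grip_force_history, baseline, drop_fraction=0.5, window=3):
    """Heuristic release detector: end the hug when the user's measured
    contact force stays well below its in-hug baseline for several
    consecutive sensor readings (i.e., the user is letting go)."""
    recent = grip_force_history[-window:]
    return len(recent) == window and all(
        f < drop_fraction * baseline for f in recent
    )
```

Requiring several consecutive low readings guards against releasing on a single noisy sample, which matters for the "reliably release" tenet.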
SFP: State-free Priors for Exploration in Off-Policy Reinforcement Learning
Efficient exploration is a crucial challenge in deep reinforcement learning. Several methods, such as behavioral priors, are able to leverage offline data in order to efficiently accelerate reinforcement learning on complex tasks. However, if the task at hand deviates excessively from the demonstrated task, the effectiveness of such methods is limited. In our work, we propose to learn features from offline data that are shared by a more diverse range of tasks, such as correlation between actions and directedness. Therefore, we introduce state-free priors, which directly model temporal consistency in demonstrated trajectories, and are capable of driving exploration in complex tasks, even when trained on data collected on simpler tasks. Furthermore, we introduce a novel integration scheme for action priors in off-policy reinforcement learning by dynamically sampling actions from a probabilistic mixture of policy and action prior. We compare our approach against strong baselines and provide empirical evidence that it can accelerate reinforcement learning in long-horizon continuous control tasks under sparse reward settings.
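The integration scheme described above samples each action from a probabilistic mixture of the task policy and the action prior. A minimal sketch of that sampling step is below; the fixed `mix_prob` and callable samplers are simplifying assumptions (the paper samples the mixing dynamically).

```python
import random

def sample_action(policy_sample, prior_sample, mix_prob):
    """Draw an action from a mixture of the task policy and the
    state-free action prior: with probability mix_prob take the prior's
    sample (exploration driven by the prior), otherwise the policy's.

    policy_sample, prior_sample: zero-argument callables returning an action.
    """
    if random.random() < mix_prob:
        return prior_sample()
    return policy_sample()
```

Early in training a higher `mix_prob` would let the temporally consistent prior drive exploration; as the policy improves, the mixture would shift toward the policy's own actions.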
Learning Functionally Decomposed Hierarchies for Continuous Control Tasks With Path Planning
We present HiDe, a novel hierarchical reinforcement learning architecture that successfully solves long horizon control tasks and generalizes to unseen test scenarios. Functional decomposition between planning and low-level control is achieved by explicitly separating the state-action spaces across the hierarchy, which allows the integration of task-relevant knowledge per layer. We propose an RL-based planner to efficiently leverage the information in the planning layer of the hierarchy, while the control layer learns a goal-conditioned control policy. The hierarchy is trained jointly but allows for the modular transfer of policy layers across hierarchies of different agents. We experimentally show that our method generalizes across unseen test environments and can scale to 3x horizon length compared to both learning and non-learning based methods. We evaluate on complex continuous control tasks with sparse rewards, including navigation and robot manipulation. © 2021 IEEE. ISSN: 2377-376
Learning Functionally Decomposed Hierarchies for Continuous Control Tasks with Path Planning
We present HiDe, a novel hierarchical reinforcement learning architecture that successfully solves long horizon control tasks and generalizes to unseen test scenarios. Functional decomposition between planning and low-level control is achieved by explicitly separating the state-action spaces across the hierarchy, which allows the integration of task-relevant knowledge per layer. We propose an RL-based planner to efficiently leverage the information in the planning layer of the hierarchy, while the control layer learns a goal-conditioned control policy. The hierarchy is trained jointly but allows for the composition of different policies such as transferring layers across multiple agents. We experimentally show that our method generalizes across unseen test environments and can scale to tasks well beyond 3x horizon length compared to both learning and non-learning based approaches. We evaluate on complex continuous control tasks with sparse rewards, including navigation and robot manipulation.
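The functional decomposition in HiDe puts a planner on top of a goal-conditioned controller. The toy rollout below sketches that interaction structure; the 1-D state, the one-step planner, and the perfect controller are hypothetical stand-ins for the learned layers.

```python
def hierarchical_rollout(start, goal, plan_fn, control_fn, max_steps=100):
    """Planning layer proposes a subgoal from the current state; the
    goal-conditioned control layer moves the agent toward that subgoal."""
    state = start
    trajectory = [state]
    for _ in range(max_steps):
        subgoal = plan_fn(state, goal)      # high-level planning layer
        state = control_fn(state, subgoal)  # low-level control layer
        trajectory.append(state)
        if state == goal:
            break
    return trajectory

# Toy 1-D example: the planner proposes one step toward the goal,
# and the controller reaches the subgoal exactly.
plan = lambda state, goal: state + (1 if goal > state else -1)
ctrl = lambda state, subgoal: subgoal
```

Because the two layers only communicate through subgoals, either one can in principle be swapped out, which mirrors the modular transfer of layers across agents described in the abstract.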
TempCLR: Reconstructing Hands via Time-Coherent Contrastive Learning
We introduce TempCLR, a new time-coherent contrastive learning approach for the structured regression task of 3D hand reconstruction. Unlike previous time-contrastive methods for hand pose estimation, our framework considers temporal consistency in its augmentation scheme, and accounts for the differences of hand poses along the temporal direction. Our data-driven method leverages unlabelled videos and a standard CNN, without relying on synthetic data, pseudo-labels, or specialized architectures. Our approach improves the performance of fully-supervised hand reconstruction methods by 15.9% and 7.6% in PA-V2V on the HO-3D and FreiHAND datasets respectively, thus establishing new state-of-the-art performance. Finally, we demonstrate that our approach produces smoother hand reconstructions through time, and is more robust to heavy occlusions compared to the previous state of the art, as we show quantitatively and qualitatively.
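Time-coherent contrastive learning of the kind described above treats temporally nearby frames as positives and other frames as negatives. The sketch below shows a standard InfoNCE loss over cosine similarities as an illustration of that objective; the temperature value and the specific positive/negative selection are assumptions, not TempCLR's exact formulation.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: pull the anchor embedding toward a temporally
    nearby frame (positive) and push it away from other frames
    (negatives), using cosine similarity."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Positive sits at index 0; softmax cross-entropy against that index.
    logits = np.array(
        [cos(anchor, positive)] + [cos(anchor, n) for n in negatives]
    ) / temperature
    return float(-logits[0] + np.log(np.sum(np.exp(logits))))
```

The loss is low when the anchor is close to its temporal positive and far from the negatives, which is what encourages embeddings (and hence reconstructions) that vary smoothly through time.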